Gradient Boosting for Spatial Panel Models with Random and Fixed Effects

Balzer, Michael, Benlahlou, Adhen

arXiv.org Machine Learning

Due to the increase in data availability in urban and regional studies, various spatial panel models have emerged to model spatial panel data, which exhibit spatial patterns and spatial dependencies between observations across time. Although estimation is usually based on maximum likelihood or the generalized method of moments, these methods may fail to yield unique solutions in high-dimensional settings. This article proposes a model-based gradient boosting algorithm that yields interpretable estimates and remains feasible in both low- and high-dimensional settings. Due to its modular nature, the flexible model-based gradient boosting algorithm is suitable for a variety of spatial panel models, which can include random and fixed effects. The general framework also enables data-driven model and variable selection as well as implicit regularization that controls the bias-variance trade-off, thereby enhancing predictive accuracy on out-of-sample spatial panel data. Monte Carlo experiments on estimation and variable selection confirm proper functionality in low- and high-dimensional settings, while real-world applications, including non-life insurance in Italian districts, rice production in Indonesian farms, and life expectancy in German districts, illustrate the potential application.
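The paper's algorithm is not reproduced here, but the core idea of model-based (componentwise) gradient boosting with implicit variable selection can be sketched for a plain linear predictor with squared-error loss. The function name `componentwise_boost`, the step count, and the learning rate `nu` are illustrative choices, not the authors' implementation:

```python
import numpy as np

def componentwise_boost(X, y, steps=200, nu=0.1):
    """Minimal componentwise L2 gradient boosting (illustrative sketch).

    Each iteration fits one univariate least-squares base learner per
    covariate to the current residuals (the negative gradient under L2
    loss) and updates only the best-fitting one, shrunken by nu."""
    n, p = X.shape
    beta = np.zeros(p)
    offset = y.mean()                        # start from the mean
    resid = y - offset
    for _ in range(steps):
        # univariate LS coefficient of the residuals on each covariate
        coefs = X.T @ resid / (X ** 2).sum(axis=0)
        # residual sum of squares if covariate j alone were updated
        sse = ((resid[:, None] - X * coefs) ** 2).sum(axis=0)
        j = sse.argmin()                     # best-fitting covariate
        beta[j] += nu * coefs[j]             # shrunken update
        resid = y - offset - X @ beta
    return offset, beta
```

Because only one coefficient moves per step, early stopping leaves irrelevant covariates at exactly zero, which is the implicit, data-driven variable selection the abstract refers to.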



How LLMs Comprehend Temporal Meaning in Narratives: A Case Study in Cognitive Evaluation of LLMs

de Langis, Karin, Park, Jong Inn, Schramm, Andreas, Hu, Bin, Le, Khanh Chi, Mensink, Michael, Tong, Ahn Thu, Kang, Dongyeop

arXiv.org Artificial Intelligence

Large language models (LLMs) exhibit increasingly sophisticated linguistic capabilities, yet the extent to which these behaviors reflect human-like cognition versus advanced pattern recognition remains an open question. In this study, we investigate how LLMs process the temporal meaning of linguistic aspect in narratives that were previously used in human studies. Using an Expert-in-the-Loop probing pipeline, we conduct a series of targeted experiments to assess whether LLMs construct semantic representations and pragmatic inferences in a human-like manner. Our findings show that LLMs over-rely on prototypicality, produce inconsistent aspectual judgments, and struggle with causal reasoning derived from aspect, raising concerns about their ability to fully comprehend narratives. These results suggest that LLMs process aspect fundamentally differently from humans and lack robust narrative understanding. Beyond these empirical findings, we develop a standardized experimental framework for the reliable assessment of LLMs' cognitive and linguistic capabilities.


Leveraging LLM-based agents for social science research: insights from citation network simulations

Ji, Jiarui, Lei, Runlin, Pan, Xuchen, Wei, Zhewei, Sun, Hao, Lin, Yankai, Chen, Xu, Yang, Yongzheng, Li, Yaliang, Ding, Bolin, Wen, Ji-Rong

arXiv.org Artificial Intelligence

The emergence of Large Language Models (LLMs) demonstrates their potential to encapsulate the logic and patterns inherent in human behavior simulation by leveraging extensive web data pre-training. However, the boundaries of LLM capabilities in social simulation remain unclear. To further explore the social attributes of LLMs, we introduce the CiteAgent framework, designed to generate citation networks based on human-behavior simulation with LLM-based agents. CiteAgent successfully captures predominant phenomena in real-world citation networks, including power-law distribution, citational distortion, and shrinking diameter. Building on this realistic simulation, we establish two LLM-based research paradigms in social science: LLM-SE (LLM-based Survey Experiment) and LLM-LE (LLM-based Laboratory Experiment). These paradigms facilitate rigorous analyses of citation network phenomena, allowing us to validate and challenge existing theories. Additionally, we extend the research scope of traditional science of science studies through idealized social experiments, with the simulation experiment results providing valuable insights for real-world academic environments. Our work demonstrates the potential of LLMs for advancing science of science research in social science.


Downscaling human mobility data based on demographic, socioeconomic, and commuting characteristics using interpretable machine learning methods

Jiang, Yuqin, Popov, Andrey A., Duan, Tianle, Li, Qingchun

arXiv.org Artificial Intelligence

Understanding urban human mobility patterns at various spatial levels is essential for social science. This study presents a machine learning framework to downscale origin-destination (OD) taxi trip flows in New York City from a larger spatial unit to a smaller spatial unit. First, relationships between OD trips and demographic, socioeconomic, and commuting characteristics are modeled using four methods: Linear Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Neural Networks (NN). Second, a perturbation-based sensitivity analysis is applied to interpret variable importance for the nonlinear models. The results show that the linear regression model fails to capture the complex variable interactions. While NN performs best on the training and testing datasets, SVM shows the best generalization ability in downscaling performance. The methodology presented in this study provides both analytical advancement and practical applications for improving transportation services and urban development.
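The abstract does not spell out the perturbation procedure. A common variant, sketched here under assumptions, adds Gaussian noise to one feature at a time and records how much prediction error degrades; `perturbation_importance`, the noise scale `sigma`, and the repeat count are hypothetical choices, not the study's code:

```python
import numpy as np

def perturbation_importance(model, X, y, sigma=0.5, repeats=10, rng=None):
    """Illustrative perturbation-based sensitivity analysis.

    Perturbs each feature with Gaussian noise and measures the average
    increase in mean squared error; a larger increase means the model
    relies more on that feature. `model` is assumed to expose a
    scikit-learn-style .predict() method."""
    rng = rng or np.random.default_rng(0)
    base_mse = np.mean((model.predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(repeats):
            Xp = X.copy()
            Xp[:, j] += rng.normal(0.0, sigma, size=len(X))
            scores[j] += np.mean((model.predict(Xp) - y) ** 2) - base_mse
    return scores / repeats
```

Unlike model-specific coefficients, this treats the fitted model as a black box, which is why it applies equally to the RF, SVM, and NN models mentioned above.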



AutoML-Med: A Framework for Automated Machine Learning in Medical Tabular Data

Francia, Riccardo, Leone, Maurizio, Leonardi, Giorgio, Montani, Stefania, Pennisi, Marzio, Striani, Manuel, D'Alfonso, Sandra

arXiv.org Artificial Intelligence

In recent years, the advent of deep learning and, in particular, transformer-based architectures has significantly revolutionized the field of Artificial Intelligence (AI) in many scientific domains, including computer vision, natural language processing, and sequence modeling, thanks to the increasing availability of computational power and large-scale datasets. However, classical Machine Learning (ML) methods, such as decision trees, gradient-boosted trees, Support Vector Machines (SVMs), and regression-based techniques, continue to be considered the state of the art for tabular data, which is still widely used in healthcare, finance, industrial monitoring, and other structured-data domains. There are several reasons for this. Notably, conventional AI models tend to perform reasonably well on datasets of limited size, whereas state-of-the-art deep learning techniques typically require substantially larger amounts of data to generalize effectively. Moreover, many classical AI methods, such as regression, Bayesian approaches, rule-based systems, and tree-based models, are inherently more interpretable, a characteristic that is particularly valuable in high-stakes domains such as healthcare. In contrast, deep learning models often work as black boxes, limiting their explainability. As an example, Grinsztajn et al. [1] showed that tree-based ensembles like XGBoost and Random Forests consistently outperformed a wide range of contemporary deep learning models across dozens of medium-sized tabular datasets.


Helix 1.0: An Open-Source Framework for Reproducible and Interpretable Machine Learning on Tabular Scientific Data

Aguilar-Bejarano, Eduardo, Lea, Daniel, Sivakumar, Karthikeyan, Mase, Jimiama M., Omidvar, Reza, Li, Ruizhe, Kettle, Troy, Mitchell-White, James, Alexander, Morgan R, Winkler, David A, Figueredo, Grazziela

arXiv.org Artificial Intelligence

The massive increase in data in scientific research requires the development and application of robust tools for data analysis and machine learning (ML) that are findable, accessible, interoperable, reusable (FAIR) and interpretable. In domains such as biomaterials science, engineering, chemistry, healthcare and biosciences, data-driven discovery typically requires interdisciplinary teams. These teams collaborate to implement unbiased data pre-processing strategies, select appropriate modelling techniques, and interpret model outputs to accelerate and inform research outcomes and support rational design and decision-making. This process is often iterative, with experts providing feedback over long periods of time to refine models and optimise the methodology adopted. In cases where initial analysis identifies issues with the data, such as outliers, unbalanced data classes, or experimental measurement uncertainty, another round of data collection and pre-processing might be necessary. That means that data for the same problem are likely to be analysed multiple times using different dataset versions and methodological pipelines. For interdisciplinary co-development of analytics, there is also a need for tools that allow domain experts to focus on interpreting and using analysis results, rather than developing code. The widespread use of ML and the overwhelming availability of thousands of community-driven open-source packages in Python and R increases the barrier for interoperable and reusable data analysis methodologies. To facilitate accurate analytics, transparency, and comparison of modelling results, there is a strong need for easy-to-use tools that automatically track data, all methodological choices, performance metrics, and corresponding results.


To MT or not to MT: An eye-tracking study on the reception by Dutch readers of different translation and creativity levels

Gerrits, Kyo, Guerberof-Arenas, Ana

arXiv.org Artificial Intelligence

This article presents the results of a pilot study involving the reception of a fictional short story translated from English into Dutch under four conditions: machine translation (MT), post-editing (PE), human translation (HT) and original source text (ST). The aim is to understand how creativity and errors in different translation modalities affect readers, specifically regarding cognitive load. Eight participants filled in a questionnaire, read a story using an eye-tracker, and conducted a retrospective think-aloud (RTA) interview. The results show that units of creative potential (UCP) increase cognitive load and that this effect is highest for HT and lowest for MT; no effect of error was observed. Triangulating the data with RTAs leads us to hypothesize that the higher cognitive load in UCPs is linked to increases in reader enjoyment and immersion. The effect of translation creativity on cognitive load in different translation modalities at word-level is novel and opens up new avenues for further research. All the code and data are available at https://github.com/INCREC/Pilot_to_MT_or_not_to_MT


asKAN: Active Subspace embedded Kolmogorov-Arnold Network

Zhou, Zhiteng, Xu, Zhaoyue, Liu, Yi, Wang, Shizhao

arXiv.org Artificial Intelligence

The Kolmogorov-Arnold Network (KAN) has emerged as a promising neural network architecture for small-scale AI+Science applications. However, it suffers from inflexibility in modeling ridge functions, which are widely used to represent the relationships in physical systems. This study investigates this inflexibility through the lens of the Kolmogorov-Arnold theorem, which represents multivariate functions by first constructing univariate components rather than combining the independent variables. Our analysis reveals that incorporating linear combinations of independent variables can substantially simplify the network architecture when representing ridge functions. Inspired by this finding, we propose the active subspace embedded KAN (asKAN), a hierarchical framework that synergizes KAN's function representation with the active subspace methodology. The architecture strategically embeds active subspace detection between KANs, where the active subspace method identifies the primary ridge directions and the independent variables are adaptively projected onto these critical dimensions. The proposed asKAN is implemented iteratively without increasing the number of neurons in the original KAN. The method is validated through function fitting, solving the Poisson equation, and reconstructing a sound field. Compared with KAN, asKAN significantly reduces the error using the same network architecture. The results suggest that asKAN enhances the capability of KAN in fitting and solving equations that take the form of ridge functions.
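The active subspace step the abstract relies on can be illustrated in a minimal form: for a ridge function f(x) = g(a^T x), every gradient is parallel to a, so the top eigenvector of the averaged gradient outer product recovers the ridge direction. The function `active_subspace` below is an illustrative sketch of that step, not the authors' asKAN code:

```python
import numpy as np

def active_subspace(grad_f, X, k=1):
    """Illustrative active subspace detection: average the outer
    products of sampled gradients and keep the top-k eigenvectors
    as the dominant ridge directions."""
    G = np.array([grad_f(x) for x in X])   # sampled gradients, (n, d)
    C = G.T @ G / len(X)                   # gradient covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh: ascending eigenvalues
    return eigvecs[:, ::-1][:, :k]         # top-k directions, (d, k)

# Example: f(x) = sin(a^T x) is a ridge function; its gradients all lie
# along a, so the one-dimensional active subspace recovers a (up to sign).
a = np.array([3.0, 4.0]) / 5.0
grad_f = lambda x: np.cos(a @ x) * a       # gradient of sin(a^T x)
X = np.random.default_rng(0).normal(size=(500, 2))
W = active_subspace(grad_f, X, k=1)
```

Projecting inputs onto the recovered direction (x -> W^T x) turns the multivariate fitting problem into a nearly univariate one, which is the structure KAN's univariate components handle well.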